Capsule Networks Do Not Need to Model Everything

Renzulli, Riccardo, Tartaglione, Enzo, Grangetto, Marco

arXiv.org Artificial Intelligence

Capsule networks are biologically inspired neural networks that group neurons into vectors called capsules, each explicitly representing an object or one of its parts. The routing mechanism connects capsules in consecutive layers, forming a hierarchical structure between parts and objects, also known as a parse tree. Capsule networks often attempt to model all elements in an image, requiring large network sizes to handle complexities such as intricate backgrounds or irrelevant objects. However, this comprehensive modeling leads to increased parameter counts and computational inefficiencies. Our goal is to enable capsule networks to focus only on the object of interest, reducing the number of parse trees. We accomplish this with REM (Routing Entropy Minimization), a technique that minimizes the entropy of the parse tree-like structure. REM drives the model's parameter distribution towards low-entropy configurations through a pruning mechanism, significantly reducing the generation of intra-class parse trees. This empowers capsules to learn more stable and succinct representations with fewer parameters and negligible performance loss.
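The entropy being minimized can be illustrated with a toy computation over one capsule's coupling coefficients. This is a minimal sketch (the function name `routing_entropy` is ours, and REM's pruning mechanism itself is not shown):

```python
import math

def routing_entropy(coupling):
    """Shannon entropy (in bits) of one capsule's coupling-coefficient
    distribution over its parent capsules. Lower entropy means the
    capsule routes more deterministically, i.e. a more stable parse tree."""
    return sum(-c * math.log2(c) for c in coupling if c > 0)

# Uniform routing over 4 parents: maximal entropy of 2 bits.
print(routing_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
# Routing concentrated on a single parent: entropy 0.
print(routing_entropy([1.0, 0.0, 0.0, 0.0]))      # 0.0
```

Driving such distributions towards the second, concentrated case is what yields fewer distinct intra-class parse trees.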


Hybrid Adaptive Modeling using Neural Networks Trained with Nonlinear Dynamics Based Features

Liu, Zihan, Kambali, Prashant N., Nataraj, C.

arXiv.org Artificial Intelligence

Accurate models are essential for design, performance prediction, control, and diagnostics in complex engineering systems. Physics-based models excel during the design phase but often become outdated during system deployment due to changing operational conditions, unknown interactions, excitations, and parametric drift. While data-based models can capture the current state of complex systems, they face significant challenges, including excessive data dependence, limited generalizability to changing conditions, and inability to predict parametric dependence. This has led to combining physics and data in modeling, termed physics-infused machine learning, often using numerical simulations from physics-based models. This paper introduces a novel approach that departs from standard techniques by uncovering information from nonlinear dynamical modeling and embedding it in data-based models. The goal is to create a hybrid adaptive modeling framework that integrates data-based modeling with newly measured data and analytical nonlinear dynamical models for enhanced accuracy, parametric dependence, and improved generalizability. By explicitly incorporating nonlinear dynamic phenomena through perturbation methods, the predictive capabilities are more realistic and insightful compared to knowledge obtained from brute-force numerical simulations. In particular, perturbation methods are utilized to derive asymptotic solutions which are parameterized to generate frequency responses. Frequency responses provide comprehensive insights into dynamics and nonlinearity which are quantified and extracted as high-quality features. A machine-learning model, trained by these features, tracks parameter variations and updates the mismatched model. The results demonstrate that this adaptive modeling method outperforms numerical gray box models in prediction accuracy and computational efficiency.
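As a concrete example of the kind of feature such perturbation methods yield, the method of multiple scales applied to an undamped Duffing oscillator x'' + omega0^2 x + gamma x^3 = 0 gives, to first order, the amplitude-dependent resonance frequency (backbone curve) omega(a) ~ omega0 + 3 gamma a^2 / (8 omega0). A minimal sketch, with illustrative parameter values not taken from the paper:

```python
def duffing_backbone(a, omega0=1.0, gamma=0.1):
    """First-order multiple-scales approximation of the amplitude-
    dependent resonance frequency of the Duffing oscillator
    x'' + omega0**2 * x + gamma * x**3 = 0. The frequency shift grows
    with the response amplitude a, the hallmark of a hardening
    nonlinearity (gamma > 0)."""
    return omega0 + 3.0 * gamma * a ** 2 / (8.0 * omega0)

# Sampling the backbone curve yields amplitude-frequency pairs that can
# serve as nonlinearity-aware features for a machine-learning model.
features = [(a / 10.0, duffing_backbone(a / 10.0)) for a in range(11)]
```

Tracking how such features shift under parameter drift is one way a trained model could update a mismatched physics-based model.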


Initialization Method for Factorization Machine Based on Low-Rank Approximation for Constructing a Corrected Approximate Ising Model

Seki, Yuya, Nakada, Hyakka, Tanaka, Shu

arXiv.org Artificial Intelligence

This paper presents an initialization method that can approximate a given approximate Ising model with a high degree of accuracy using a factorization machine (FM), a machine learning model. The construction of an Ising model using an FM is applied to black-box combinatorial optimization problems using a factorization machine with quantum annealing (FMQA). It is anticipated that the optimization performance of FMQA will be enhanced through an implementation of the warm-start method. Nevertheless, the optimal initialization method for leveraging the warm-start approach in FMQA remains undetermined. Consequently, the present study compares initialization methods based on random initialization and low-rank approximation, and then identifies a suitable one for use with warm-start in FMQA through numerical experiments. Furthermore, the properties of the low-rank-approximation-based initialization method for the FM are analyzed using random matrix theory, demonstrating that the approximation accuracy of the proposed method is not significantly influenced by the specific Ising model under consideration. The findings of this study will facilitate advancements in research in the field of black-box combinatorial optimization through the use of Ising machines.
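One way to realize such a low-rank initialization is to set the FM's interaction factors V from a truncated eigendecomposition of the given coupling matrix, so that V @ V.T reproduces its dominant spectrum. The sketch below is an illustration under that assumption (function name ours; the paper's exact construction, and its handling of negative eigenvalues, may differ):

```python
import numpy as np

def low_rank_fm_init(J, k):
    """Initialize FM interaction factors V (n x k) from a rank-k
    eigendecomposition of a symmetric coupling matrix J, so that
    V @ V.T approximates J. Since V @ V.T is positive semidefinite,
    negative eigenvalues of J cannot be represented and are clipped
    to zero in this simplified sketch."""
    w, Q = np.linalg.eigh(J)          # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]     # indices of the k largest eigenvalues
    w_k = np.clip(w[idx], 0.0, None)  # drop any negative eigenvalues
    return Q[:, idx] * np.sqrt(w_k)
```

For a rank-k positive semidefinite J the reconstruction V @ V.T is exact; for a general coupling matrix it is only a PSD rank-k surrogate.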


Causal Discovery and Classification Using Lempel-Ziv Complexity

Dhruthi, null, Nagaraj, Nithin, B, Harikrishnan N

arXiv.org Artificial Intelligence

Inferring causal relationships in the decision-making processes of machine learning algorithms is a crucial step toward achieving explainable Artificial Intelligence (AI). In this research, we introduce a novel causality measure and a distance metric derived from Lempel-Ziv (LZ) complexity. We explore how the proposed causality measure can be used in decision trees by enabling splits based on features that most strongly cause the outcome. We further evaluate the effectiveness of the causality-based decision tree and the distance-based decision tree in comparison to a traditional decision tree using Gini impurity. While the proposed methods demonstrate comparable classification performance overall, the causality-based decision tree significantly outperforms both the distance-based decision tree and the Gini-based decision tree on datasets generated from causal models. This result indicates that the proposed approach can capture insights beyond those of classical decision trees, especially in causally structured data. Based on the features used in the decision tree built with the LZ causality measure, we introduce a causal strength for each feature in the dataset so as to infer the predominant causal variables for the occurrence of the outcome.
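The LZ complexity underlying both proposed measures can be computed with the classic Kaspar-Schuster parsing of a symbol sequence into previously unseen phrases; a minimal implementation (the causality and distance measures built on top of it are not reproduced here):

```python
def lz_complexity(s):
    """Number of phrases in the Lempel-Ziv (LZ76) parsing of the symbol
    sequence s: scan left to right, starting a new phrase whenever the
    current substring has not occurred earlier in the sequence."""
    n = len(s)
    i, k, l = 0, 1, 1
    k_max, c = 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:
                c += 1
                break
        else:
            k_max = max(k, k_max)
            i += 1
            if i == l:          # no earlier match: a new phrase begins
                c += 1
                l += k_max
                if l + 1 > n:
                    break
                i, k, k_max = 0, 1, 1
            else:
                k = 1
    return c

print(lz_complexity("0000"))        # 2  (phrases: 0 | 000)
print(lz_complexity("0101010101"))  # 3  (phrases: 0 | 1 | 01010101)
```

Low LZ complexity indicates a highly compressible, regular sequence; the measures above exploit how the complexity of one sequence changes in the presence of another.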


Dynamic Routing Between Capsules

Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton

Neural Information Processing Systems

A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters.
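The length-as-probability encoding relies on the paper's squashing nonlinearity, v_j = (|s_j|^2 / (1 + |s_j|^2)) * s_j / |s_j|, which keeps a capsule's output length strictly below 1. A plain-Python sketch:

```python
import math

def squash(s):
    """Squashing nonlinearity from Sabour et al.: short input vectors
    shrink towards zero length, long ones approach (but never reach)
    unit length, so the output length reads as an existence
    probability while the orientation is preserved."""
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    if norm == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in s]

v = squash([3.0, 4.0])           # input length 5
length = math.hypot(v[0], v[1])  # output length 25/26, just under 1
```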


Collective Learning by Ensembles of Altruistic Diversifying Neural Networks

Brazowski, Benjamin, Schneidman, Elad

arXiv.org Machine Learning

Combining the predictions of collections of neural networks often outperforms the best single network. Such ensembles are typically trained independently, and their superior 'wisdom of the crowd' originates from the differences between networks. Collective foraging and decision making in socially interacting animal groups is often improved or even optimal thanks to local information sharing between conspecifics. We therefore present a model for co-learning by ensembles of interacting neural networks that aim to maximize their own performance but also their functional relations to other networks. We show that ensembles of interacting networks outperform independent ones, and that optimal ensemble performance is reached when the coupling between networks increases diversity and degrades the performance of individual networks. Thus, even without a global goal for the ensemble, optimal collective behavior emerges from local interactions between networks. We show the scaling of optimal coupling strength with ensemble size, and that networks in these ensembles specialize functionally and become more 'confident' in their assessments. Moreover, optimal co-learning networks differ structurally, relying on sparser activity, a wider range of synaptic weights, and higher firing rates - compared to independently trained networks. Finally, we explore interactions-based co-learning as a framework for expanding and boosting ensembles.
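A per-network co-learning objective of this kind can be sketched as the network's own task loss plus a local coupling term on its functional relation to the others. The coupling form below (mean squared output difference, rewarded as diversity) is an assumption of this sketch, not the paper's exact objective:

```python
def coupled_loss(task_loss, own_out, others_out, lam):
    """Schematic co-learning objective for one ensemble member: its own
    task loss minus lam times the mean squared difference between its
    outputs and the other networks' outputs, so lam > 0 rewards
    disagreement and pushes the ensemble towards diversity."""
    disagreement = sum(
        sum((a - b) ** 2 for a, b in zip(own_out, out)) / len(own_out)
        for out in others_out
    ) / len(others_out)
    return task_loss - lam * disagreement
```

Sweeping lam then traces how collective performance trades off against individual accuracy, mirroring the optimal coupling strength discussed above.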


Training Deep Capsule Networks

Peer, David, Stabinger, Sebastian, Rodriguez-Sanchez, Antonio

arXiv.org Machine Learning

The capsules of Capsule Networks are collections of neurons that represent an object or part of an object in a parse tree. The output vector of a capsule encodes the so-called instantiation parameters of this object (e.g. position, size, or orientation). The routing-by-agreement algorithm routes output vectors from lower-level capsules to upper-level capsules. This iterative algorithm selects the most appropriate parent capsule so that the active capsules in the network represent nodes in a parse tree. This parse tree represents the hierarchical composition of objects out of smaller and smaller components. In this paper, we show experimentally that the routing-by-agreement algorithm does not ensure the emergence of a parse tree in the network. To ensure that all active capsules form a parse tree, we introduce a new routing algorithm called dynamic deep routing. We show that this routing algorithm allows the training of deeper capsule networks and is also more robust to white-box adversarial attacks than the original routing algorithm.
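The routing-by-agreement loop analyzed here can be sketched as iterated softmax routing driven by the agreement between a child's predictions and each parent's output. This minimal version omits the squash nonlinearity, and the paper's dynamic deep routing algorithm is not reproduced:

```python
import math

def dynamic_routing(u_hat, iters=3):
    """Routing-by-agreement over prediction vectors u_hat[i][j]
    (child capsule i -> parent capsule j). Returns the coupling
    coefficients; parents whose outputs agree with a child's
    predictions win more of that child's vote, which is how a
    parse-tree-like assignment is meant to emerge."""
    n_in, n_out = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]
    for _ in range(iters):
        # c_i = softmax over parents of the routing logits b_i
        c = [[math.exp(x) / sum(math.exp(y) for y in row) for x in row]
             for row in b]
        # each parent's output: coupling-weighted sum of predictions
        v = [[sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
              for d in range(dim)] for j in range(n_out)]
        # agreement update: b_ij += u_hat_ij . v_j
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return c
```

When two children both predict the same parent strongly, their coupling to that parent grows across iterations, while couplings to parents with no agreeing predictions decay.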